1. Introduction

Since its definition in 1988 by Aharonov, Albert, and Vaidman, the weak value $A_w$ of a
quantum operator $A$ has been a source of considerable controversy. The formal weak
value expression,

$$A_w = \frac{\langle\psi_f|A|\psi_i\rangle}{\langle\psi_f|\psi_i\rangle}, \qquad (1.1)$$

was originally derived as the weak coupling limit of the shift in the mean of a
Gaussian momentum pointer $p$ under a specific von Neumann interaction Hamiltonian
$H_I(t) = -g(t)\, q \otimes A$ that coupled the conjugate position pointer $q$ to a system observable
$A$ subject to the double boundary conditions of a pure initial preparation state $|\psi_i\rangle$ and
a pure final post-selection state $|\psi_f\rangle$. Though the conditioned pointer shift lent itself
to a natural interpretation as a conditioned average, the weak value expression (1.1)
violated such intuition by exceeding the eigenvalue range of the observable $A$ and even
being complex. Despite later experimental confirmation of the effect [2], there was a
feeling that such a strange quantity would prove to be an anomalous curiosity.

Far from being an anomaly, however, the formal weak value expression (1.1)
has persisted in the literature as a relatively stable quantity in a diverse array
of systems. Ironically, the same features that made its interpretation troublesome
have since been fruitfully used to theoretically address a number of conceptual
difficulties in quantum mechanics, including the three-box paradox, Hardy's paradox,
superluminal travel, Bohmian trajectories, complementarity, macrorealism violation,
and contextuality [3]. More practically, the inflation from the eigenvalue range has
been exploited to amplify small signals above the background noise in polarization and
interferometric experiments [4].

Given its increasingly common presence in the literature, there was considerable
motivation to find a firmer foundation under which the formal expression (1.1) could be
understood as a generally measurable feature related to an observable in a pre- and
post-selected ensemble. Recently we provided such a foundation in the form of a Physical
Review Letter [5] that indicated how the quantum weak value could be subsumed as
an idealized special case of a more flexible operational formalism for the generalized
measurement of observables, which we dubbed the contextual values formalism. Our
Letter indicated that a principled generalization of the weak value,

$${}_f\langle A\rangle_w = \frac{\mathrm{Tr}\left(E_f^{(2)}(A\rho + \rho A)\right)}{2\,\mathrm{Tr}\left(E_f^{(2)}\rho\right)}, \qquad (1.2)$$

could be uniquely defined as the weak measurement limit of the most general empirical
conditioned average under certain conditions from a mixed initial state $\rho$ and an unsharp
post-selection represented by an arbitrary probability operator (or POVM element)
$E_f^{(2)}$. The generalization (1.2) reduces to the real part of (1.1) for pure states, clarifying
the origin and significance of the formal expression (1.1) from a broader perspective.
Detailed discussion on the derivation was to be saved for a longer paper reviewing and
extending the full theory of contextual values, which has now also been posted [6].

The conditions under which the generalized weak value (1.2) can be uniquely defined
as a limit point of a conditioned average have recently been contested in six versions
of a lengthy arXiv paper [7] and a summary of the same [8]. The latter presents
concise proposed counter-examples to the uniqueness of the definition (1.2) based on
the understanding of our work, to which we now reply. The basic issue at hand is quite
simple: under what conditions can one obtain the result (1.2) as the limit point of a
conditioned average?

As we explicitly mention in [5], the conditioned average does not generally converge
to (1.2) in the weak measurement limit; indeed, the limit can depend on the details of
the detection setup, which we call the measurement context. We stress that our result
(1.2) is thus not in contradiction to the general observation that the weak value is
not a unique limit point of a conditioned average, which has been previously reported
[10]. The sole issues being clarified here are the sufficient conditions for obtaining the
context-independent special case (1.2) from the general form of the conditioned average.

This paper is devoted entirely to the subject of the uniqueness of the definition of the
generalized weak value and is organized as follows. In section 2 we review some elements
of the contextual values formalism. In section 3 we analyze a proposed counter-example
from [8] with the contextual values formalism. In section 4 we review the motivation
behind our protocol for contextual value assignment. This is followed in section 5 by a
general theorem and proof of our original definition in [5] along with a precise statement
of the sufficient conditions for our theorem to hold. After discussion of the theorem in
section 6, we analyze a second proposed counter-example from [9] in section 7. Finally,
we give our conclusions in section 8.

2. Contextual Value formalism 

To keep this work self-contained, we briefly review the contextual values formalism 
introduced in [5] and expanded upon in [6] . The central observation of the contextual 
values formalism is that an observable A for a particular system can be completely 
measured indirectly using an imperfectly correlated detector. The formalism is powerful 
enough to subsume strong measurements, weak measurements, and any strength of 
measurement in between. Indeed, the von Neumann measurement originally used to derive
the weak value (1.1) becomes a special case.

For the typical case of a detector with a pure preparation state $|d\rangle$ that is coupled
to the system with any joint unitary operation $U_{sd}$ and then subsequently measured in
a detector basis $\{|j\rangle\}$, such an indirect measurement will be completely characterized
by a set of measurement operators on the system $\{M_j = \langle j|U_{sd}|d\rangle\}$, which we call
a measurement context. As in [5], we restrict ourselves to this typical case in what
follows for simplicity; the straightforward generalization to impure detector preparation
is detailed in [6].

When a system state $\rho$ is conditioned on a particular outcome $j$ of the detector, it
becomes updated according to $\rho_j = M_j\rho M_j^\dagger / P(j)$, where the normalization probability
for detecting the outcome $j$ is given by $P(j) = \mathrm{Tr}(\rho E_j)$. The positive probability
operators $\{E_j = M_j^\dagger M_j\}$ partition unity, $\sum_j E_j = 1$, forming a positive
operator-valued measure (POVM) on the system space.

The expectation value of the observable $A$ can be accurately measured by the
imperfectly correlated detector provided that the following operator identity exists,

$$A = \sum_j \alpha_j E_j, \qquad (2.1)$$

$$\langle A\rangle = \mathrm{Tr}(\rho A) = \sum_j \alpha_j P(j), \qquad (2.2)$$

which defines the contextual values $\{\alpha_j\}$ of the observable $A$ with respect to the
measurement context $\{M_j\}$. As we shall explain in section 4, in the event that multiple
solutions for the contextual values exist we prescribe picking the solution that places the
tightest bound on the detector variance, which can be found using the pseudoinverse.
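
As a minimal sketch of how the identity (2.1) fixes the contextual values, consider a hypothetical two-outcome measurement of a qubit observable with eigenvalues ±1. The POVM elements $E_\pm = (1 \pm gA)/2$, the strength `g = 0.1`, the state `rho`, and the helper name `contextual_values` are illustrative assumptions and are not taken from [5] or [6].

```python
import numpy as np

def contextual_values(F, a):
    """Solve a = F @ alpha with the Moore-Penrose pseudoinverse."""
    return np.linalg.pinv(F) @ a

g = 0.1                                   # assumed measurement strength
a = np.array([1.0, -1.0])                 # eigenvalues of the observable A
# F[k, j] = Tr(Pi_k E_j) for the assumed POVM E_+/- = (1 +/- g A) / 2
F = 0.5 * np.array([[1 + g, 1 - g],
                    [1 - g, 1 + g]])
alpha = contextual_values(F, a)           # analytically (+1/g, -1/g)

rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])              # assumed system state
povm = [np.diag(F[:, j]) for j in range(F.shape[1])]
probs = np.array([np.trace(rho @ E).real for E in povm])

mean_detector = float(alpha @ probs)                  # sum_j alpha_j P(j), as in (2.2)
mean_direct = float(np.trace(rho @ np.diag(a)).real)  # Tr(rho A)
print(alpha, mean_detector, mean_direct)
```

In this sketch the contextual values grow as 1/g, yet the weighted detector average reproduces Tr(ρA), which remains inside the eigenvalue range.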

If the observable $A$ also commutes with the entire measurement context,
$\forall j,\ [A, M_j] = 0$, then all the statistical moments of $A$ can also be accurately measured
by correlating sequences of measurements on the detector,

$$\langle A^n\rangle = \sum_{j_1\cdots j_n} (\alpha_{j_1}\cdots\alpha_{j_n})\, \mathrm{Tr}\left(\rho E_{j_1}\cdots E_{j_n}\right). \qquad (2.3)$$

We call a detector that can measure all moments of A a fully compatible detector. In 
what follows we will concern ourselves mostly with fully compatible detectors. 
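
The moment formula (2.3) can be illustrated with a short sketch for a fully compatible detector. The two-outcome POVM $E_\pm = (1 \pm gA)/2$ with contextual values $\pm 1/g$, the state `rho`, and the helper `moment` are hypothetical choices made only for this example.

```python
import numpy as np
from itertools import product

g = 0.1                                         # assumed measurement strength
povm = [np.diag([(1 + g) / 2, (1 - g) / 2]),    # assumed E_+
        np.diag([(1 - g) / 2, (1 + g) / 2])]    # assumed E_-
alpha = np.array([1 / g, -1 / g])               # contextual values for A = diag(1, -1)
rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])                    # assumed system state

def moment(n):
    """n-th moment of A from n-fold detector correlations, following (2.3)."""
    total = 0.0
    for outcomes in product(range(len(povm)), repeat=n):
        weight = np.prod(alpha[list(outcomes)])
        op = rho
        for j in outcomes:
            op = op @ povm[j]
        total += weight * np.trace(op).real
    return total

second = moment(2)   # A^2 is the identity here, so this should be 1
third = moment(3)    # A^3 = A here, so this should equal Tr(rho A) = 0.4
print(second, third)
```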

For the special case of a projective detector, the measurement context $\{\Pi_k\}$ consists
of the spectral projections of $A$, so (2.1) reduces to the spectral expansion $A = \sum_k a_k \Pi_k$
as a special case, where $\{a_k\}$ are the eigenvalues of $A$, and (2.3) reduces to the standard
formula $\langle A^n\rangle = \sum_k a_k^n P(k)$ that needs only a single repeated measurement to obtain
all moments. Hence, the contextual values can be considered to form a generalized
spectrum for the observable that is specific to a particular measurement context.

If a second measurement is made after the first measurement of $A$ that is
characterized by an arbitrary second measurement context and associated probability
operators $\{E_f^{(2)}\}$, we can also construct the most general conditioned averages of the
observable,

$${}_f\langle A\rangle = \sum_j \alpha_j P(j|f), \qquad (2.4)$$

$$P(j|f) = \frac{\mathrm{Tr}\left(E_f^{(2)} M_j \rho M_j^\dagger\right)}{\sum_j \mathrm{Tr}\left(E_f^{(2)} M_j \rho M_j^\dagger\right)}. \qquad (2.5)$$

The post-selected conditional probabilities $P(j|f)$ are generalizations of the
Aharonov-Bergmann-Lebowitz rule [3] that handle mixed states, general intermediate
measurement, and unsharp post-selections. As the conditioned averages (2.4) are
constructed entirely from measurable quantities, they form a principled foundation for
deriving the generalization of the weak value (1.2) as a limiting value as the correlation
between the system and detector vanishes.

In what follows, we stress that the contextual values formalism itself has not been
challenged. Only the details of the derivation of the context-independent weak value
(1.2) using the general conditioned average (2.4) are being contested.



3. Analysis of a counter-example 



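We now address the counter-example provided in [8]. A case where the number of
POVM elements (or measurement operators in this case) exceeds the dimension of the
Hilbert space for a system observable $A$ is considered therein,

$$M_1 = \begin{pmatrix} 1/2 + g & 0 \\ 0 & 1/2 - g \end{pmatrix}, \quad
M_2 = \begin{pmatrix} 1/2 - g & 0 \\ 0 & 1/2 + g \end{pmatrix}, \quad
M_3 = \begin{pmatrix} \sqrt{1/2 - 2g^2} & 0 \\ 0 & \sqrt{1/2 - 2g^2} \end{pmatrix}, \qquad (3.1)$$

$$A = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}, \qquad (3.2)$$

where the operators are expressed as matrices in the basis that diagonalizes $A$.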

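To calibrate the measurement, one is then faced with determining contextual values
that satisfy the (now underspecified) equation (2.1). To see this in detail, since all
operators commute and are diagonalized in the same basis, we can write (2.1) as the
equivalent matrix equation, $\vec{a} = F\vec{\alpha}$, where $F_{kj} = \mathrm{Tr}(\Pi_k E_j)$,

$$\begin{pmatrix} a \\ b \end{pmatrix} =
\begin{pmatrix} (1/2 + g)^2 & (1/2 - g)^2 & 1/2 - 2g^2 \\ (1/2 - g)^2 & (1/2 + g)^2 & 1/2 - 2g^2 \end{pmatrix}
\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{pmatrix}. \qquad (3.3)$$

This underspecified matrix equation is then solved in [8] by choosing $\alpha_1 = 1/g^2$
arbitrarily and then solving the resulting modified equation,

$$\begin{pmatrix} (1/2 - g)^2 & 1/2 - 2g^2 \\ (1/2 + g)^2 & 1/2 - 2g^2 \end{pmatrix}
\begin{pmatrix} \alpha_2 \\ \alpha_3 \end{pmatrix} =
\begin{pmatrix} a - (1/2 + g)^2/g^2 \\ b - (1/2 - g)^2/g^2 \end{pmatrix}, \qquad (3.4)$$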
which leads to the full solution,

$$\begin{aligned}
\alpha_1 &= \frac{1}{g^2}, \\
\alpha_2 &= \frac{1}{g^2} - \frac{a - b}{2g}, \\
\alpha_3 &= \frac{4 + 16g^2 - 8ag^2 - g(a - b)(1 - 2g)^2}{4g^2(4g^2 - 1)}
= -\frac{1}{g^2} + \frac{a - b}{4g} + (a + b - 8) + O(g),
\end{aligned} \qquad (3.5)$$

which contains poles of order $1/g^2$ by construction. These poles then contribute an
extra context-dependent term to the weak limit of (2.4) that is not included in (1.2).
For the specific choice of the identity $a = b = 1$ considered in [8], then $\alpha_1 = \alpha_2 = 1/g^2$
and $\alpha_3 = (2g^2 + 1)/(g^2(4g^2 - 1))$.

We devote a considerable amount of space to this type of underspecified case in
our four-page Letter [5]. We write on page 2, "The latter case [where the number of
POVM elements exceeds the dimension of the system operator] results in an infinite
number of possible solutions, $\vec{\alpha}$. As such, we propose that the physically sensible
choice of [contextual values] is the least redundant set uniquely related to the eigenvalues
through the Moore-Penrose pseudoinverse." All examples we give in the paper use the
pseudoinverse, and this discussion occurs immediately before the conditioned average
section under contention.

The problem with the counter-example is that the pseudoinverse solution is not
employed, and consequently the freedom in the set of underspecified equations is used
to insert by hand an anomalous contextual value that diverges as $1/g^2$ in order to
artificially produce an extra contribution to the result (1.2) in the $g \to 0$ weak limit.
Indeed, we could go further by similarly choosing a contextual value that diverges as
$g^{-m}$, where $m > 2$. Such a case would produce a formally divergent conditioned average
in the weak limit.

However, if we solve for the contextual values using the prescription we describe in
our paper, the assignment gives a clear physical interpretation to the measurement
that is being done. The pseudoinverse solution is found from the singular value
decomposition, $F = U\Sigma V^T$. For this example, we find,

$$U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, \qquad
\Sigma = \begin{pmatrix} 2g & 0 & 0 \\ 0 & \frac{1}{2}\sqrt{48g^4 - 8g^2 + 3} & 0 \end{pmatrix},$$

$$V = \frac{1}{\sqrt{2(48g^4 - 8g^2 + 3)}}
\begin{pmatrix}
\sqrt{48g^4 - 8g^2 + 3} & 4g^2 + 1 & \sqrt{2}(4g^2 - 1) \\
-\sqrt{48g^4 - 8g^2 + 3} & 4g^2 + 1 & \sqrt{2}(4g^2 - 1) \\
0 & -2(4g^2 - 1) & \sqrt{2}(4g^2 + 1)
\end{pmatrix}. \qquad (3.6)$$
The pseudoinverse is then $F^+ = V\Sigma^+U^T$, where $\Sigma^+$ is a diagonal matrix inverting all
nonzero elements of $\Sigma^T$. We can then find our prescribed solution as $\vec{\alpha}_0 = F^+\vec{a}$, which
for this example is,

$$\begin{aligned}
\alpha_1 &= \frac{a - b}{4g} + \frac{(a + b)(4g^2 + 1)}{48g^4 - 8g^2 + 3} = \frac{a - b}{4g} + \frac{a + b}{3} + O(g^2), \\
\alpha_2 &= -\frac{a - b}{4g} + \frac{(a + b)(4g^2 + 1)}{48g^4 - 8g^2 + 3} = -\frac{a - b}{4g} + \frac{a + b}{3} + O(g^2), \\
\alpha_3 &= \frac{2(a + b)(1 - 4g^2)}{48g^4 - 8g^2 + 3} = \frac{2(a + b)}{3} + O(g^2).
\end{aligned} \qquad (3.7)$$
The largest pole in the solution (3.7) has order $1/g$, which is the inverse of the
smallest nonzero order of $g$ in the POVM generated by (3.1); we will show this is the
general rule for pseudoinverse solutions that correctly satisfy $\vec{a} = F\vec{\alpha}$ with the lowest
nonzero order in $g$. It is then easy to check that the generalized weak value (1.2) will
be recovered from the conditioned average (2.4) in the weak limit as $g \to 0$ for any pre-
and post-selection, as claimed.
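
The weak-limit claim can be checked numerically with a minimal sketch. It assumes the measurement operators are the positive diagonal roots $M_1 = \mathrm{diag}(1/2+g, 1/2-g)$, $M_2 = \mathrm{diag}(1/2-g, 1/2+g)$, $M_3 = \sqrt{1/2-2g^2}\,1$; the initial state `rho`, the post-selection `e_f`, the strength `g`, and all function names are illustrative assumptions.

```python
import numpy as np

def measurement_operators(g):
    # Assumed positive roots of the POVM whose diagonals form the columns of F.
    return [np.diag([0.5 + g, 0.5 - g]),
            np.diag([0.5 - g, 0.5 + g]),
            np.sqrt(0.5 - 2 * g**2) * np.eye(2)]

def conditioned_average(alpha, ops, rho, e_f):
    """Conditioned average: sum_j alpha_j P(j|f) with post-selected probabilities."""
    w = np.array([np.trace(e_f @ m @ rho @ m.conj().T).real for m in ops])
    return float(alpha @ w / w.sum())

def generalized_weak_value(a_op, rho, e_f):
    """Tr(E_f (A rho + rho A)) / (2 Tr(E_f rho))."""
    num = np.trace(e_f @ (a_op @ rho + rho @ a_op)).real
    return float(num / (2 * np.trace(e_f @ rho).real))

g = 1e-3                                     # assumed small strength
a = np.array([1.0, 1.0])                     # identity observable, a = b = 1
ops = measurement_operators(g)
F = np.column_stack([np.diag(m.conj().T @ m).real for m in ops])

alpha_pinv = np.linalg.pinv(F) @ a           # pseudoinverse prescription
alpha_hand = np.array([1 / g**2, 1 / g**2,   # hand-picked solution with 1/g^2 poles
                       (2 * g**2 + 1) / (g**2 * (4 * g**2 - 1))])

rho = np.array([[0.5, 0.4], [0.4, 0.5]])     # assumed mixed initial state
f = np.array([np.cos(np.pi / 8), np.sin(np.pi / 8)])
e_f = np.outer(f, f)                         # assumed sharp post-selection

target = generalized_weak_value(np.diag(a), rho, e_f)
avg_pinv = conditioned_average(alpha_pinv, ops, rho, e_f)
avg_hand = conditioned_average(alpha_hand, ops, rho, e_f)
print(target, avg_pinv, avg_hand)
```

Both sets of contextual values satisfy the same linear system, but in this sketch only the pseudoinverse choice returns a conditioned average close to the generalized weak value of the identity.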

For the special case of the identity, $a = b = 1$, that is considered, the solution
(3.7) does not diverge as $g \to 0$, but actually converges to a constant. This behavior is
intuitive because the measured system operator is the identity: the identity can always
be constructed from the $g = 0$ POVM alone. In this case, the first two contextual
values converge to the same value of 2/3, while the third contextual value converges
to 4/3 and contributes twice as much to the average; this makes physical sense as the
first two outcomes balance each other to produce the identity, while the third outcome
directly corresponds to the identity being measured. Moreover, for the orthogonal case
$a = 1$, $b = -1$ the first two contextual values simplify to $\pm 1/(2g)$, while the third
contextual value vanishes entirely; this makes physical sense since the third outcome
is orthogonal to the operator being measured and can therefore be discarded. None of
these physically intuitive features are present in the solution (3.5) presented in [8].
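
These special cases can be reproduced with a short numerical sketch that compares NumPy's pseudoinverse against the closed-form pseudoinverse solution. The strength `g = 0.01`, the third test pair `(0.3, 2.0)`, and the function names are assumptions made for illustration.

```python
import numpy as np

def povm_matrix(g):
    """F[k, j] = Tr(Pi_k E_j) for the three-outcome example."""
    p, q, r = (0.5 + g)**2, (0.5 - g)**2, 0.5 - 2 * g**2
    return np.array([[p, q, r],
                     [q, p, r]])

def closed_form(a, b, g):
    """Closed-form pseudoinverse contextual values for eigenvalues (a, b)."""
    d = 48 * g**4 - 8 * g**2 + 3
    sym = (a + b) * (4 * g**2 + 1) / d
    return np.array([(a - b) / (4 * g) + sym,
                     -(a - b) / (4 * g) + sym,
                     2 * (a + b) * (1 - 4 * g**2) / d])

g = 0.01                                     # assumed strength
results = {}
for a, b in [(1.0, 1.0), (1.0, -1.0), (0.3, 2.0)]:
    numeric = np.linalg.pinv(povm_matrix(g)) @ np.array([a, b])
    results[(a, b)] = (numeric, closed_form(a, b, g))
    print((a, b), numeric)
```

For the identity the values stay near (2/3, 2/3, 4/3), and for $a = 1$, $b = -1$ they reduce to $(1/(2g), -1/(2g), 0)$.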



4. Pseudoinverse prescription 

It is now worthwhile to review the pseudoinverse prescription, and to discuss its
methodology and advantages. We recall that the equation we are solving is (2.1) in
the form of the matrix equation $\vec{a} = F\vec{\alpha}$, where $F$ is an $N \times M$ matrix ($N$ being the
dimension of the system, and $M$ being the number of POVM elements) given by its
elements, $F_{kj} = \mathrm{Tr}(\Pi_k E_j)$. We can then decompose this matrix with the singular
value decomposition, $F = U\Sigma V^T$, where $U$ is an $N \times N$ orthogonal matrix, $V$ is an
$M \times M$ orthogonal matrix, and $\Sigma$ is an $N \times M$ diagonal matrix of singular values. The
pseudoinverse of $F$ is then constructed as $F^+ = V\Sigma^+U^T$, where $\Sigma^+$ is an $M \times N$ diagonal
matrix formed by inverting the non-zero singular values. The pseudoinverse reduces
correctly to the true inverse if one exists.

With the pseudoinverse in hand, we then find a uniquely specified solution $\vec{\alpha}_0 =
F^+\vec{a}$ that is directly related to the eigenvalues of the operator. Other solutions $\vec{\alpha} = \vec{\alpha}_0 + \vec{x}$
of (2.1) will contain additional components in the null space of $F$, and will thus deviate
from this least redundant solution. Consequently, the solution $\vec{\alpha}_0$ has the least norm of
all solutions, since $\|\vec{\alpha}\|^2 = \|\vec{\alpha}_0\|^2 + \|\vec{x}\|^2$ by the Pythagorean theorem, given that
$\vec{\alpha}_0$ and $\vec{x}$ live in orthogonal subspaces. Even in the case of an overdetermined set of
equations (where the number of detector outcomes is less than the dimension of the
system), the pseudoinverse will give the "best fit" solution in the least-squares sense.
This can be seen by solving $F^TF\vec{\alpha} = F^T\vec{a}$. One will also obtain $\vec{\alpha} = \vec{\alpha}_0 + \vec{x}$, where
now $\vec{\alpha}_0$ does not solve $F\vec{\alpha} = \vec{a}$, but is the least squares fit to it, and $\vec{x}$ is in the null
space of $F^TF$. As a physical example of this last situation one could use a grid of point
measurements like a pixel array to approximate measurements for a continuous variable,
such as position.
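
Both properties of the pseudoinverse can be seen in a small sketch: the least-norm property for an underdetermined system and the least-squares property for an overdetermined one. The matrices below are random placeholders rather than physical measurement contexts, and the seed and the 0.7 admixture of the null vector are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Underdetermined case: 2 eigenvalues, 3 detector outcomes (hypothetical F).
F = rng.random((2, 3))
a = np.array([1.0, -1.0])
alpha0 = np.linalg.pinv(F) @ a            # least-norm solution

# The last right-singular vector spans the null space of this rank-2 matrix.
null_vec = np.linalg.svd(F)[2][-1]
alpha = alpha0 + 0.7 * null_vec           # another exact solution of a = F alpha

norm0_sq = float(alpha0 @ alpha0)
norm_sq = float(alpha @ alpha)            # norm0_sq + 0.7**2 by orthogonality

# Overdetermined case: 3 eigenvalues, 2 outcomes; pinv gives the least-squares fit.
F_tall = rng.random((3, 2))
a_tall = np.array([1.0, 0.0, -1.0])
alpha_fit = np.linalg.pinv(F_tall) @ a_tall
alpha_lstsq = np.linalg.lstsq(F_tall, a_tall, rcond=None)[0]
print(norm0_sq, norm_sq, alpha_fit, alpha_lstsq)
```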

In addition to the mathematical reasons for using the pseudoinverse in this context, 
there is an important physical one that we will now describe. As mentioned, a fully 
compatible detector can be used together with the contextual values to reconstruct 
any moment of a compatible observable. However, since the detector outcomes are 
imperfectly correlated with the observable, the contextual values typically lie outside 
of the eigenvalue range and many repetitions of the measurement must be practically 
performed to obtain adequate precision for the moments. Importantly, the uncertainty 
in the moments is controlled by the variance — not of the observable operator, but of the 
contextual values themselves, 

$$\sigma^2 = \sum_j \alpha_j^2 P(j) - \langle A\rangle^2, \qquad (4.1)$$

where $P(j)$ is the probability of outcome $j$. Since the mean of the contextual values
is set by construction to the mean of the observable being measured, it is in the
experimentalist's best interest to minimize the second moment of the contextual values.
This moment has a simple upper bound of $\sum_j \alpha_j^2 P(j) \le \sum_j \alpha_j^2 = \|\vec{\alpha}\|^2$ because
$0 \le P(j) \le 1$, which will also constitute an upper bound of the variance $\sigma^2$. In absence
of prior knowledge about the system one is dealing with, this is a reasonable upper bound
to make. Therefore, by minimizing this upper bound the pseudoinverse will choose a
solution that provides rapid statistical convergence for observable measurements on the
system given no prior knowledge of the system state.

For the case of the counterexample in [8], the solution (3.5) has to leading order
the bound on the variance,

$$\|\vec{\alpha}\|^2 = \frac{3}{g^4} - \frac{3(a - b)}{2g^3} + O\!\left(\frac{1}{g^2}\right), \qquad (4.2)$$

while the pseudoinverse solution (3.7) has to leading order the bound,

$$\|\vec{\alpha}_0\|^2 = \frac{(a - b)^2}{8g^2} + \frac{2}{3}(a + b)^2 + O(g^2). \qquad (4.3)$$

For any observable $\vec{a}$ the solution (3.5) has a detector variance bounded by leading order
$1/g^4$, which could generally swamp any attempt to measure an observable near the weak
limit. In particular, the conditioned averages (2.4) would not generally be tractable to
obtain, so the anomalous weak limit derived in [8] may not be easily observable without
a special initial state. However, the pseudoinverse solution (3.7) has a detector variance
bounded by leading order $1/g^2$ in the worst case; moreover, for the identity, $a = b = 1$,
the bound on the noise reduces to a constant as $g \to 0$.
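
The two variance bounds can be compared numerically in a brief sketch. The values `g = 0.01`, `a = 2`, `b = 1` are arbitrary assumptions, and the leading-order expressions in the code, $3/g^4 - 3(a-b)/(2g^3)$ and $(a-b)^2/(8g^2) + 2(a+b)^2/3$, are written out explicitly so the block does not rely on any equation numbering.

```python
import numpy as np

g, a, b = 0.01, 2.0, 1.0                        # assumed illustrative values
p, q, r = (0.5 + g)**2, (0.5 - g)**2, 0.5 - 2 * g**2
F = np.array([[p, q, r],
              [q, p, r]])
eig = np.array([a, b])

# Hand-picked solution: fix alpha_1 = 1/g^2, then solve the remaining 2x2 system.
alpha1 = 1 / g**2
rest = np.linalg.solve(F[:, 1:], eig - alpha1 * F[:, 0])
alpha_hand = np.concatenate(([alpha1], rest))

# Pseudoinverse prescription.
alpha_pinv = np.linalg.pinv(F) @ eig

bound_hand = float(alpha_hand @ alpha_hand)     # squared norm, grows like 1/g^4
bound_pinv = float(alpha_pinv @ alpha_pinv)     # squared norm, grows like 1/g^2
lead_hand = 3 / g**4 - 3 * (a - b) / (2 * g**3)
lead_pinv = (a - b)**2 / (8 * g**2) + 2 * (a + b)**2 / 3
print(bound_hand, lead_hand, bound_pinv, lead_pinv)
```

With these numbers the hand-picked bound is larger than the pseudoinverse bound by roughly five orders of magnitude.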

5. General theorem 

We now give a general proof of the result (1.2). To obtain this result, we make the
following sufficient assumptions:

(i) The measurement operators $\{M_j\}$ are analytic functions of a measurement strength
parameter $g$, and thus have well defined Taylor expansions around $g = 0$ such that
$\forall j,\ \lim_{g\to 0} M_j \propto 1$. This is physically reasonable because measurement operators
are typically composed from matrix elements of an analytic evolution operator
under an interaction Hamiltonian for which $g$ is the coupling constant.

(ii) If a measurement operator $M_j = U_j E_j^{1/2}$ is not positive, its unitary freedom
$U_j = \exp(igG_j)$ is generated by a Hermitian operator $G_j$ that commutes with
the density matrix $\rho$ of the system, $\forall j,\ [G_j, \rho] = 0$. The reason for this assumption
will become clear.

(iii) The equality $A = \sum_j \alpha_j(g)E_j(g)$ must be satisfied, where the contextual values
$\alpha_j(g)$ are selected according to the pseudoinverse prescription.

(iv) The minimum nonzero order in $g$ for all $E_j(g)$ is $g^n$ such that (iii) is satisfied. (In
[5] we considered the typical case $n = 1$.)

(v) The POVM elements $\{E_j\}$ all commute with the observable $A$, so that they are
diagonalizable in the same basis.

Then we have the following theorem: in the weak limit $g \to 0$ the context dependence of
the conditioned average (2.4) vanishes and the generalized weak value (1.2) is uniquely
defined.

We note before we prove this result that these are only the sufficient conditions for
the unique definition (1.2) that we implied in [5]; some of the assumptions might be
further weakened. For example, there may be other principled inversion schemes for the
contextual values that also lead to the context-independent result (1.2).

To obtain the proof, we shall rewrite (2.4) in a useful form and then take the
weak limit as $g \to 0$. Using the polar decomposition of the measurement operators
$M_j = U_j E_j^{1/2}$, we rewrite the probabilities that appear in (2.4) as,

$$\mathrm{Tr}\left(E_f^{(2)} M_j \rho M_j^\dagger\right) = \mathrm{Tr}\left((U_j^\dagger E_f^{(2)} U_j)\, \rho'_j\right), \qquad (5.1)$$

where the modified density operator is,

$$\rho'_j = E_j^{1/2} \rho E_j^{1/2} = \frac{1}{2}\{E_j, \rho\} - \frac{1}{2}[E_j^{1/2}, [E_j^{1/2}, \rho]], \qquad (5.2)$$

and $\{a, b\} = ab + ba$ denotes the anticommutator.
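
The operator identity behind the modified density operator, $E^{1/2}\rho E^{1/2} = \frac{1}{2}\{E, \rho\} - \frac{1}{2}[E^{1/2}, [E^{1/2}, \rho]]$, holds for any positive operator and can be verified with a quick sketch. The random matrices and helper names below are illustrative assumptions; the check does not require the matrix to be a normalized POVM element.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_positive(dim):
    """Random Hermitian positive semidefinite matrix (illustrative helper)."""
    x = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return x @ x.conj().T

def psd_sqrt(e):
    """Positive square root via the eigendecomposition."""
    w, v = np.linalg.eigh(e)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T

def comm(x, y):
    return x @ y - y @ x

E = random_positive(3)                    # stands in for a positive operator E_j
rho = random_positive(3)
rho = rho / np.trace(rho).real            # normalize to a density matrix
root = psd_sqrt(E)

lhs = root @ rho @ root
rhs = 0.5 * (E @ rho + rho @ E) - 0.5 * comm(root, comm(root, rho))
print(np.abs(lhs - rhs).max())
```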
From assumptions (i) and (iv) we have the lowest nonzero order expansion of the
POVM $E_j = p_j 1 + g^n E_j^{(n)} + O(g^{n+1})$, where $p_j \in (0, 1)$ are nonzero probabilities such
that $\sum_j p_j = 1$. We therefore also have the expansion of the positive roots to the same
order in $g$,

$$E_j^{1/2}(g) = \sqrt{p_j}\, 1 + g^n \frac{E_j^{(n)}}{2\sqrt{p_j}} + O(g^{n+1}). \qquad (5.3)$$

The probabilities $p_j$ must be nonzero to satisfy assumption (i). The physical probability
of outcome $j$ is given by $P(j) = \mathrm{Tr}(\rho E_j)$, and therefore converges to $p_j$ in the weak
limit, $g \to 0$.

Inserting the expression (5.3) into (5.2), we find

$$\rho'_j = p_j \rho + \frac{g^n}{2}\{E_j^{(n)}, \rho\} - \frac{g^{2n}}{8p_j}[E_j^{(n)}, [E_j^{(n)}, \rho]]. \qquad (5.4)$$

This leaves the probabilities that appear in (2.4) to be,

$$\mathrm{Tr}\left(E_f^{(2)} M_j \rho M_j^\dagger\right) = p_j\, \mathrm{Tr}\left((U_j^\dagger E_f^{(2)} U_j)\rho\right) + \frac{g^n}{2}\, \mathrm{Tr}\left((U_j^\dagger E_f^{(2)} U_j)\{E_j^{(n)}, \rho\}\right), \qquad (5.5)$$

plus a correction of order $O(g^{2n})$.

Invoking assumption (ii), since the generators $G_j$ of the unitaries commute with the
density matrix, the unitary itself commutes with the density matrix. Consequently,
the first term on the right-hand side of (5.5) simplifies to $p_j \mathrm{Tr}(E_f^{(2)}\rho)$. In the
term of order $O(g^n)$, we can expand the unitary operator to first order in $g$, $U_j =
1 + igG_j + O(g^2)$, to find that the second term on the right-hand side of (5.5) simplifies
to $(g^n/2)\, \mathrm{Tr}(E_f^{(2)}\{E_j^{(n)}, \rho\})$ plus a correction of order $O(g^{n+1})$.

Thus, we find that the denominator of (2.4) is

$$\sum_j p_j\, \mathrm{Tr}\left(E_f^{(2)}\rho\right) + \sum_j \frac{g^n}{2}\, \mathrm{Tr}\left(E_f^{(2)}\{E_j^{(n)}, \rho\}\right) + O(g^{n+1}). \qquad (5.6)$$

However, since $\sum_j p_j = 1$ and $\sum_j E_j^{(n)} = 0$ (the POVM condition), the denominator is
simply $\mathrm{Tr}(E_f^{(2)}\rho)$, with a correction of order $O(g^{n+1})$.

The numerator of (2.4) is given by summing (5.5) with the contextual values to
find,

$$\sum_j \alpha_j\, \mathrm{Tr}\left(E_f^{(2)}\, \frac{1}{2}\{p_j 1 + g^n E_j^{(n)}, \rho\}\right) + O(g^{n+1}). \qquad (5.7)$$

We note that to order $g^n$, $A = \sum_j \alpha_j(g)(p_j 1 + g^n E_j^{(n)})$; since this sum exactly appears
in the numerator, we recover our original result (1.2), up to a numerator correction of
order $O(g^{n+1})$ times the order of each $\alpha_j$. Thus, the only way the result (1.2) can be
spoiled under our assumptions is if $\alpha_j(g)$ has a pole larger than $O(1/g^n)$. Hence, the
last step in the proof will be to show that the pseudoinverse solution of $\alpha_j(g)$ cannot
have a pole larger than $O(1/g^n)$.

To address the order of the contextual values $\alpha_j(g)$, we will first simplify notation
by noting that $A$ commutes with $\{E_j(g)\}$ according to assumption (v). As such, we
will replace all the diagonal matrices with vectors and rewrite the contextual values
definition (2.1) from assumption (iii) as an equivalent matrix equation,

$$\vec{a} = F\vec{\alpha}, \qquad (5.8)$$

where,

$$F = \begin{pmatrix} \vec{E}_1(g) & \vec{E}_2(g) & \cdots \end{pmatrix} = P + g^n F_n + O(g^{n+1}), \qquad (5.9)$$

and the two leading order matrices are defined as,

$$P = \begin{pmatrix} p_1\vec{1} & p_2\vec{1} & \cdots \end{pmatrix}, \qquad
F_n = \begin{pmatrix} \vec{E}_1^{(n)} & \vec{E}_2^{(n)} & \cdots \end{pmatrix}. \qquad (5.10)$$

As discussed, the minimum norm solution to (5.8) is the pseudoinverse solution
$\vec{\alpha}_0 = F^+\vec{a}$. The pseudoinverse is constructed from the singular value decomposition
$F = U\Sigma V^T$ as $F^+ = V\Sigma^+U^T$, where $U$ and $V$ are orthogonal matrices such that
$U^TU = VV^T = 1$, $\Sigma$ is the singular value matrix composed of the square roots of
the eigenvalues of $FF^T$, and $\Sigma^+$ is composed of the inverse nonzero elements in $\Sigma^T$.
In order to satisfy (5.8), then, we have the equivalent condition for each component of
$U^T\vec{a} = \Sigma V^T\vec{\alpha}$,

$$(U^T\vec{a})_k = \Sigma_{kk}(V^T\vec{\alpha})_k. \qquad (5.11)$$

Therefore, all singular values $\Sigma_{kk}$ corresponding to nonzero components of $U^T\vec{a}$ must
also be nonzero; for brevity we shall call these the relevant singular values. Singular
values which are not relevant will not contribute to the solution $\vec{\alpha}_0 = V\Sigma^+U^T\vec{a}$. Since
$\alpha_j = (V\Sigma^+U^T\vec{a})_j = \sum_k V_{jk}\Sigma^+_{kk}(U^T\vec{a})_k$, any zero element of $U^T\vec{a}$ will eliminate the
inverse irrelevant singular value from the solution for $\alpha_j$.

Since the orthogonal matrices $U$ and $V$ have nonzero orthogonal limits $\lim_{g\to 0} U =
U_0$ and $\lim_{g\to 0} V = V_0$, such that $U_0^TU_0 = V_0V_0^T = 1$, and since $\vec{a}$ is $g$-independent,
then the only poles in the solution $\vec{\alpha}_0 = F^+\vec{a} = V\Sigma^+U^T\vec{a}$ must come from the
inverses of the relevant singular values in $\Sigma^+$. If a singular value $\Sigma_{kk} = O(g^m)$, then
$\Sigma^+_{kk} = 1/\Sigma_{kk} = O(1/g^m)$; therefore, to have a pole of order higher than $O(1/g^n)$ then
there must be at least one relevant singular value with a leading order greater than
$g^n$. However, if that were the case then the expansion of $F$ to order $g^n$ would have
a relevant singular value of zero and therefore could not satisfy (5.11), contradicting
the assumption (iv) about the minimum nonzero order of the POVM. Therefore, the
pseudoinverse solution $\vec{\alpha}_0 = F^+\vec{a}$ can have no pole with order higher than $O(1/g^n)$ and
the theorem is proved.
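
The pole-order statement can be probed numerically. The sketch below reuses the three-outcome POVM with diagonal entries $(1/2 \pm g)^2$ and $1/2 - 2g^2$, for which the lowest nonzero order is $n = 1$, and estimates the scaling exponent of the largest pseudoinverse contextual value. The eigenvalues `(2, 1)`, the sampled strengths, and the function name are assumptions for illustration.

```python
import numpy as np

def pinv_contextual_values(g, eig):
    """Pseudoinverse contextual values for the three-outcome example."""
    p, q, r = (0.5 + g)**2, (0.5 - g)**2, 0.5 - 2 * g**2
    F = np.array([[p, q, r],
                  [q, p, r]])
    return np.linalg.pinv(F) @ eig

eig = np.array([2.0, 1.0])                 # assumed eigenvalues (a, b)
strengths = np.array([1e-2, 1e-3, 1e-4])
largest = np.array([np.abs(pinv_contextual_values(g, eig)).max()
                    for g in strengths])

# Log-log slope of the largest contextual value versus g; close to -n = -1 here.
slopes = np.diff(np.log(largest)) / np.diff(np.log(strengths))
print(largest, slopes)
```

A slope near -1 is consistent with a leading pole of order $1/g$ and no higher.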

6. Discussion 

As we stated in our Letter [5], "we find that as $g \to 0$, the weak limit [of (2.4)] generally
depends explicitly on $\{G_j\}$ and $\{\alpha_j\}$, and thus will change depending on how it is
measured and how the [contextual values] are chosen." These dependences are apparent
in the proof from equations (5.1) through (5.7). In other words, we find that the weak
limit of the conditioned average is not generally unique. However, to produce any
limiting value other than (1.2) one needs to violate the sufficient conditions given for
our theorem. Namely, to find a different weak limit one needs either a nonanalytic
or incompatible measurement context, a unitary disturbance that persists in the weak
limit, a minimum nonzero order of $g$ that does not satisfy the observable identity, or
pathologically chosen contextual values.

In (5.2) the positive root of the POVM element $E_j$ performs the information
extraction of the measurement and modifies $\rho$ to $\rho'_j$, which consists of two terms: a
symmetric term involving the POVM element itself, and a double-commutator involving
the roots. The symmetric term leads to the weak value (1.2), while the commutator term
produces measurement disturbance away from (1.2). The unitary part of the measurement
operator in (5.1) rotates the post-selection $E_f^{(2)}$ to a different post-selection that depends
explicitly on the measurement result obtained, so this also disturbs the measurement
process independently of the information extraction of the measurement.

For this reason, we consider a measurement consisting solely of positive POVM
roots to be a minimally disturbing measurement, which is consistent with the usage
of the term by Wiseman and Milburn [11]. That is, the information extraction of
the measurement necessarily disturbs the system state by a minimum amount, but no
additional unitary rotation occurs. Note that a weak measurement is an independent
concept from a minimally disturbing measurement.

In [5] we named the limit as $g \to 0$ under the sufficiency condition $\forall j,\ [G_j, \rho] = 0$
placed on the unitary generators $G_j$ the minimal disturbance limit since the
measurement operators act like minimally disturbing POVM roots in that limit. The
minimal disturbance definition $U_j = 1$ becomes a special case.

7. Analysis of a second counter-example 

Shortly after a preprint for this paper was posted, the reference [8] was updated to a sixth
version [9] that adds a second proposed counter-example to our theorem, which we now
address. The second proposed counter-example uses a three-outcome POVM to measure
an observable in a three-dimensional Hilbert space to avoid any ambiguity related to
the contextual values being underspecified. Specifically, the following measurement
operators and observable are employed,

where we have corrected a minor typo in the definition of $M_2$ ‡. Computing the
contextual values required to satisfy the relation $A_p = \sum_i \alpha_i M_i^2$ produces,

The $1/g^2$ dependence of the contextual values can lead to the conditioned average (2.4)
having additional context-dependent terms beyond the weak value (1.2) that are relevant
in the weak limit, which seemingly contradicts our theorem.

This example, however, violates sufficiency condition (iv) for our theorem.
Specifically, to first order in $g$, which is the lowest nonzero order, the POVM elements
are,

$$E'_1 = \begin{pmatrix} 1/2 + g & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1/2 + g \end{pmatrix}, \quad
E'_2 = \begin{pmatrix} 1/3 & 0 & 0 \\ 0 & 1/3 + g & 0 \\ 0 & 0 & 1/3 \end{pmatrix}, \quad
E'_3 = (1/6 - g)\, 1. \qquad (7.4)$$

While these first order POVM elements do satisfy the POVM condition $E'_1 + E'_2 + E'_3 = 1$,
there is no exact solution to the required identity $A_p = \sum_i \alpha_i E'_i$.

‡ The missing square root over the 1/3 that is needed to satisfy the POVM condition and obtain the
contextual values (7.3) has been restored.
We can see this fact by first noting that there is no solution for an arbitrary
observable $A$. Specifically, if we write the required identity as a matrix equation
$\vec{a} = F\vec{\alpha}$, with,

$$F = \begin{pmatrix} 1/2 + g & 1/3 & 1/6 - g \\ 1/2 & 1/3 + g & 1/6 - g \\ 1/2 + g & 1/3 & 1/6 - g \end{pmatrix}, \qquad (7.5)$$

then $F^{-1}$ does not exist since $\det(F) = 0$, so there is no general solution $\vec{\alpha} = F^{-1}\vec{a}$.
However, there may still exist specific observables $\vec{a}'$ for which $\vec{a}' = F\vec{\alpha}$ is an
underspecified system of equations with an infinite number of valid contextual value
solutions. To rule out such a case for the specific observable $\vec{a}_p = (1, 0, 0)$, we compute
the pseudoinverse solution $\vec{\alpha} = F^+\vec{a}_p$,

(7.6) 
and subsequently compute,

$$F(F^+\vec{a}_p) = \frac{1}{2}\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}. \qquad (7.7)$$

Since this does not equal $\vec{a}_p = (1, 0, 0)$, then $\vec{a}_p$ is partially in the nullspace of $F^+$ and
there can be no exact solution to the required identity $A_p = \sum_i \alpha_i E'_i$.

Therefore, sufficiency condition (iv) for our theorem is violated and we do not expect
the theorem to hold. Intuitively, the measurement (7.1) is not sufficiently correlated with
the specific observable $A_p$ as $g \to 0$ to guarantee the weak value (1.2) as the limit point
of the conditioned average (2.4).
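
The failure of condition (iv) can be checked with a short numerical sketch. It assumes the first-order POVM elements have diagonals $(1/2+g, 1/2, 1/2+g)$, $(1/3, 1/3+g, 1/3)$, and $(1/6-g)(1, 1, 1)$, which form the columns of `F`; the strength `g = 0.05` and the cutoff `1e-10` are arbitrary choices.

```python
import numpy as np

g = 0.05                                    # assumed small strength
F = np.array([[0.5 + g, 1 / 3,     1 / 6 - g],
              [0.5,     1 / 3 + g, 1 / 6 - g],
              [0.5 + g, 1 / 3,     1 / 6 - g]])
a_p = np.array([1.0, 0.0, 0.0])             # target observable A_p = diag(1, 0, 0)

det_F = np.linalg.det(F)                    # rows 1 and 3 coincide, so this vanishes
rank_F = np.linalg.matrix_rank(F, tol=1e-10)

# An explicit cutoff keeps round-off from turning the zero singular value into a pole.
alpha = np.linalg.pinv(F, rcond=1e-10) @ a_p
reconstructed = F @ alpha                   # orthogonal projection of a_p onto range(F)
residual = np.linalg.norm(reconstructed - a_p)
print(det_F, rank_F, reconstructed, residual)
```

Because every column of `F` has equal first and third entries, the best reconstruction of $(1, 0, 0)$ is $(1/2, 0, 1/2)$, and the residual does not shrink with `g`.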

Moreover, if another observable $A$ could be found such that $A = \sum_i \alpha_i E'_i$ were
satisfiable to first order in $g$ by the pseudoinverse solution, then the discussion after
(5.11) in the proof of our theorem would apply. Hence, higher order poles would not
appear in the contextual values, and the generalized weak value (1.2) would be obtained
as the unique limit point of the conditioned average (2.4).



8. Conclusion 

We have expanded upon and defended the claim made in our Letter [5] that the
context-independent generalized weak value (1.2) can be uniquely defined as a limit point of the
conditioned average (2.4), and have given sufficient mathematical assumptions required
for the definition to hold. Conceptually, the measurement context should depend on a
measurement strength parameter $g$ such that it reduces to the identity as $g \to 0$; any
additional unitary disturbance in the measurement should not affect the state above and
beyond the measurement being performed; the observable should be measurable to the
lowest nonzero order in $g$; the contextual values of the measurement should be chosen
to minimize an upper bound for the detector variance; and the probability operators
for the measurement should commute with the observable.

We have also addressed two counter-examples to our definition that were proposed
in versions 1 through 6 of an arXiv post [8, 9]. In the former example our prescription
for constructing contextual values in the case of a redundant detector (or underspecified
measurement context) was not employed, and an anomalously divergent contextual value
was inserted by hand; when our prescription for assigning contextual values is correctly
applied, our theorem holds and a clear physical interpretation can be given to the
measurement. In the latter example a measurement context was chosen that cannot
construct the desired observable to the lowest nonzero order in $g$, so our theorem does
not apply. Addressing these examples further demonstrates the power and utility of the
contextual values formalism.